Hal: an Automated Pipeline for Phylogenetic Analyses of Genomic Data
Abstract
The rapid increase in genomic and genome-scale data is making unprecedented amounts of discrete sequence data available for phylogenetic analyses. Major analytical obstacles remain, however, before these data can be analyzed with existing phylogenetic software. These include the management of large data sets without standardized naming conventions, the identification and filtering of orthologous clusters of proteins or genes, and the assembly of alignments of orthologous sequence data into individual and concatenated super alignments. Here we report an automated pipeline, Hal, that produces multiple alignments and trees from genomic data. The alignments can be produced by a choice of four alignment programs and analyzed by a variety of phylogenetic programs. In short, the Hal pipeline connects the programs BLASTP, MCL, user-specified alignment programs, GBlocks, ProtTest and user-specified phylogenetic programs to produce species trees. The script is available at SourceForge (http://sourceforge.net/projects/bio-hal/). The results from an example analysis of Kingdom Fungi are briefly discussed.
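The step of assembling per-cluster alignments into a concatenated super alignment can be illustrated with a minimal Python sketch. This is not Hal's own code: the function name `concatenate_alignments`, the dictionary-based input format, and the taxon labels are assumptions made for illustration only. The sketch assumes each orthologous cluster has already been aligned, and it pads taxa missing from a cluster with gap characters.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-cluster alignments into one super alignment.

    `alignments` is a list of dicts mapping taxon name -> aligned sequence.
    (Hypothetical input format, chosen for illustration.)
    Taxa absent from a cluster are filled with `gap` for that cluster's width.
    """
    # Collect every taxon seen in any cluster, in a stable order.
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}

    for aln in alignments:
        if not aln:
            continue  # an empty cluster contributes no columns
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        width = widths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, gap * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Two toy clusters; "Ncra" is absent from the second one,
# so it receives three gap characters for that cluster.
cluster_a = {"Scer": "MK-T", "Ncra": "MKAT", "Anid": "MR-T"}
cluster_b = {"Scer": "GLV", "Anid": "G-V"}
super_alignment = concatenate_alignments([cluster_a, cluster_b])
for taxon, seq in super_alignment.items():
    print(taxon, seq)
```

Padding with gaps keeps every row the same length, which most phylogenetic programs require of a concatenated matrix. Whether a cluster with missing taxa should be included at all is a separate filtering decision that this sketch does not address.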
Similar resources
Dinoflagellate Genomic Organization and Phylogenetic Marker Discovery Utilizing Deep Sequencing Data
Title of Dissertation: Dinoflagellate Genomic Organization and Phylogenetic Marker Discovery Utilizing Deep Sequencing Data. Gregory Scott Mendez, Doctor of Philosophy, 2016. Dissertation directed by: Professor Charles F. Delwiche, Cell Biology and Molecular Genetics. Dinoflagellates possess large genomes in which most genes are present in many copies. This has made studies of their genomic organi...
Phylogenetic Conflict in Bears Identified by Automated Discovery of Transposable Element Insertions in Low-Coverage Genomes
Phylogenetic reconstruction from transposable elements (TEs) offers an additional perspective to study evolutionary processes. However, detecting phylogenetically informative TE insertions requires tedious experimental work, limiting the power of phylogenetic inference. Here, we analyzed the genomes of seven bear species using high-throughput sequencing data to detect thousands of TE insertions...
Analysis of Failure Caused by In-service Welding in an X52 Gas Pipeline
In the research presented in this paper, a failure analysis was carried out to identify the causes of an incident that took place after an operation to repair a leak in an interstate natural gas pipeline. In this operation, a partial encirclement reinforcement (patch) was welded to the carrier pipe according to an available hot tapping procedure, while gas was flowing in the pipeline. The fa...
IMP: a pipeline for reproducible integrated metagenomic and metatranscriptomic analyses
We present IMP, an automated pipeline for reproducible integrated analyses of coupled metagenomic and metatranscriptomic data. IMP incorporates preprocessing, iterative co-assembly of metagenomic and metatranscriptomic data, analyses of microbial community structure and function as well as genomic signature-based visualizations. Complementary use of metagenomic and metatranscripto...